Challenges in Predicting Disease State with Apache Spark
نویسندگان
چکیده
منابع مشابه
Balanced Graph Partitioning with Apache Spark
A significant part of the data produced every day by online services is structured as a graph. Therefore, there is the need for efficient processing and analysis solutions for large scale graphs. Among the others, the balanced graph partitioning is a well known NP-complete problem with a wide range of applications. Several solutions have been proposed so far, however most of the existing state-...
متن کاملSPARQL query processing with Apache Spark
The number and the size of linked open data graphs keep growing at a fast pace and confronts semantic RDF services with problems characterized as Big data. Distributed query processing is one of them and needs to be efficiently addressed with execution guaranteeing scalability, high availability and fault tolerance. RDF data management systems requiring these properties are rarely built from sc...
متن کاملBenchmarking Apache Spark with Machine Learning Applications
We benchmarked Apache Spark with a popular parallel machine learning training application, Distributed Stochastic Gradient Descent for Matrix Factorization [5] and compared the Spark implementation with alternative approaches for communicating model parameters, such as scheduled pipelining using POSIX socket or MPI, and distributed shared memory (e.g. parameter server [13]). We found that Spark...
متن کاملApproximate Stream Analytics in Apache Flink and Apache Spark Streaming
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing — based on the chosen sample size — can make a systematic trade-off between the output accuracy and computation effi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: MOJ Proteomics & Bioinformatics
سال: 2016
ISSN: 2374-6920
DOI: 10.15406/mojpb.2016.03.00076